GRM: Generalized Regression Model for Clustering Linear Sequences
Abstract
Linear relations are valuable in rule discovery for stocks, for example "if stock X goes up by 1, stock Y goes down by 3". Traditional linear regression models the linear relation between two sequences well. However, if a user asks "please cluster the stocks in the NASDAQ market into groups whose sequences have strong linear relationships with each other", comparing the sequences pair by pair is prohibitively expensive. In this paper, we propose a new model named GRM (Generalized Regression Model) to handle the problem of clustering linear sequences. GRM provides a measure, GR, that indicates the degree of linearity of multiple sequences without having to compare each pair of them. In our experiments on stocks in the NASDAQ market, the GRM clustering algorithm found many interesting clusters of linear stocks accurately and efficiently.
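To make the contrast in the abstract concrete, here is a minimal sketch of a pairwise check next to a single score for a whole group of sequences. The abstract does not define GR, so `joint_linearity` below is a hypothetical stand-in (the share of variance captured by the first principal direction of the standardized sequences) and is not the paper's measure. The function names and the toy sequences are assumptions made for illustration only.

```python
import numpy as np


def pairwise_r2(x, y):
    """R^2 of the simple linear regression of y on x (squared Pearson correlation)."""
    r = np.corrcoef(np.asarray(x, dtype=float), np.asarray(y, dtype=float))[0, 1]
    return float(r * r)


def joint_linearity(seqs):
    """Hypothetical group score, NOT the paper's GR.

    Standardizes each sequence, then returns the fraction of total variance
    explained by the first principal direction. The score approaches 1.0 when
    all sequences are linear functions of one another.
    """
    X = np.asarray(seqs, dtype=float).T      # rows: time points, columns: sequences
    std = X.std(axis=0)
    if np.any(std == 0):
        raise ValueError("constant sequences have no defined linear relation")
    Z = (X - X.mean(axis=0)) / std           # z-score each sequence
    s = np.linalg.svd(Z, compute_uv=False)   # singular values only
    energy = s ** 2
    return float(energy[0] / energy.sum())


# Toy data (assumed, not from the paper): y falls by 3 whenever x rises by 1.
t = np.arange(10, dtype=float)
x = t
y = -3.0 * t + 5.0
z = 2.0 * t + 1.0
w = t ** 2                                   # not a linear function of x

print(pairwise_r2(x, y))                     # one pair at a time
print(joint_linearity([x, y, z]))            # whole group at once, close to 1.0
print(joint_linearity([x, y, w]))            # drops below 1.0 once w is included
```

Checking k sequences pairwise takes k(k-1)/2 regressions, while the group score needs one decomposition of the stacked sequences, which is the kind of saving the abstract describes.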
Similar resources
Clustering Protein Sequences with Tailored General Regression Model Technique
Cluster analysis divides data into groups that are meaningful, useful, or both. Analysis of biological data is creating a new generation of epidemiologic, prognostic, diagnostic and treatment modalities. Clustering of protein sequences is one of the current research topics in the field of computer science. Linear relation is valuable in rule discovery for a given data, such as if value X goes u...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...
The OSCAR for Generalized Linear Models
The Octagonal Selection and Clustering Algorithm in Regression (OSCAR) proposed by Bondell and Reich (2008) has the attractive feature that highly correlated predictors can obtain exactly the same coefficient, yielding clustering of predictors. Estimation methods are available for linear regression models. It is shown how the OSCAR penalty can be used within the framework of generalized linear mod...
An Application of the Generalized Rectangular Fuzzy Model to Critical Thinking Assessment
The authors apply the Generalized Rectangular Model (GRM) for assessing students’ critical thinking skills. GRM is a variation of the center of gravity (COG) defuzzification technique, which was properly adapted and used by them several times in the past as an assessment method, called here the Rectangular Model (RM). The central idea of the GRM is the “movement” to the left of the rectangles a...
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...